

RevoNAD: Reflective Evolutionary Exploration for Neural Architecture Design

Chang, Gyusam, Yoon, Jeongyoon, yi, Shin han, Lee, JaeHyeok, Jang, Sujin, Kim, Sangpil

arXiv.org Artificial Intelligence

Recent progress in leveraging large language models (LLMs) has enabled Neural Architecture Design (NAD) systems to generate new architectures that are not limited to a manually predefined search space. Nevertheless, LLM-driven generation remains challenging: the token-level design loop is discrete and non-differentiable, preventing feedback from smoothly guiding architectural improvement. As a result, these methods commonly suffer from mode collapse into redundant structures or drift toward infeasible designs when constructive reasoning is not well grounded. We introduce RevoNAD, a reflective evolutionary orchestrator that bridges LLM-based reasoning with feedback-aligned architectural search. First, RevoNAD uses a Multi-round Multi-expert Consensus to turn isolated design rules into meaningful architectural clues. Then, Adaptive Reflective Exploration adjusts the degree of exploration based on reward variance: it explores when feedback is uncertain and refines when stability is reached. Finally, Pareto-guided Evolutionary Selection promotes architectures that jointly optimize accuracy, efficiency, latency, confidence, and structural diversity. Across CIFAR10, CIFAR100, ImageNet16-120, COCO-5K, and Cityscapes, RevoNAD achieves state-of-the-art performance. Ablation and transfer studies further validate the effectiveness of RevoNAD in enabling practically reliable and deployable neural architecture design.
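The two selection ideas named in this abstract, variance-driven exploration and Pareto-guided selection, can be illustrated with a minimal sketch. This is not the RevoNAD implementation: the function names, the variance-to-rate mapping, and the "higher is better" score convention are all assumptions made for illustration.

```python
from statistics import pvariance
from typing import Dict, List, Sequence


def exploration_rate(rewards: Sequence[float], low: float = 0.1,
                     high: float = 0.9, scale: float = 1.0) -> float:
    """Map reward variance to an exploration rate in [low, high).

    Hypothetical rule: zero variance gives `low` (refine),
    large variance approaches `high` (explore).
    """
    var = pvariance(rewards)
    return low + (high - low) * var / (var + scale)


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is no worse than `b` everywhere and strictly better somewhere.

    All objectives are assumed to be 'higher is better'
    (negate costs such as latency before calling).
    """
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(candidates: Dict[str, Sequence[float]]) -> List[str]:
    """Return the names of candidates that no other candidate dominates."""
    return [
        name for name, score in candidates.items()
        if not any(dominates(other, score)
                   for other_name, other in candidates.items()
                   if other_name != name)
    ]
```

For example, with scores given as (accuracy, negated latency), a candidate that matches another on accuracy but is slower is dropped from the front, while candidates that trade accuracy against latency are both kept.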


Register Any Point: Scaling 3D Point Cloud Registration by Flow Matching

Pan, Yue, Sun, Tao, Zhu, Liyuan, Nunes, Lucas, Armeni, Iro, Behley, Jens, Stachniss, Cyrill

arXiv.org Artificial Intelligence

Point cloud registration aligns multiple unposed point clouds into a common frame, and is a core step for 3D reconstruction and robot localization. In this work, we cast registration as conditional generation: a learned continuous, point-wise velocity field transports noisy points to a registered scene, from which the pose of each view is recovered. Unlike previous methods that conduct correspondence matching to estimate the transformation between a pair of point clouds and then optimize the pairwise transformations to realize multi-view registration, our model directly generates the registered point cloud. With a lightweight local feature extractor and test-time rigidity enforcement, our approach achieves state-of-the-art results on pairwise and multi-view registration benchmarks, particularly with low overlap, and generalizes across scales and sensor modalities. It further supports downstream tasks including relocalization, multi-robot SLAM, and multi-session map merging.
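The abstract states that each view's pose is recovered from the generated, registered points. One standard way to recover a rigid transform from corresponding point pairs is a least-squares (Procrustes/Kabsch-style) fit. The sketch below shows the closed-form 2D case using only the standard library; the paper works in 3D, and whether the authors use this exact procedure is an assumption. The function name `fit_rigid_2d` is hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def fit_rigid_2d(src: List[Point], dst: List[Point]) -> Tuple[float, Point]:
    """Least-squares rotation angle and translation mapping src onto dst.

    Assumes src[i] corresponds to dst[i]. Returns (theta, (tx, ty)) such that
    dst ~= R(theta) @ src + t.
    """
    n = len(src)
    # Centroids of both point sets.
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(q[0] for q in dst) / n
    dy = sum(q[1] for q in dst) / n

    # Accumulate dot and cross products of the centered pairs.
    dot = 0.0
    cross = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - sx, py - sy
        bx, by = qx - dx, qy - dy
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx

    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation aligns the rotated source centroid with the target centroid.
    tx = dx - (c * sx - s * sy)
    ty = dy - (s * sx + c * sy)
    return theta, (tx, ty)
```

Projecting generated points back onto the nearest rigid transform in this way is also one simple reading of "rigidity enforcement", although the abstract does not specify how that step is done.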


TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and Analysis

Feng, Zhengpeng, Atzberger, Clement, Jaffer, Sadiq, Knezevic, Jovana, Sormunen, Silja, Young, Robin, Lisaius, Madeline C., Immitzer, Markus, Jackson, Toby, Ball, James, Coomes, David A., Madhavapeddy, Anil, Blake, Andrew, Keshav, Srinivasan

arXiv.org Artificial Intelligence

Satellite Earth-observation (EO) time series in the optical and microwave ranges of the electromagnetic spectrum are often irregular due to orbital patterns and cloud obstruction. Compositing addresses these issues but loses information with respect to vegetation phenology, which is critical for many downstream tasks. Instead, we present TESSERA, a pixel-wise foundation model for multi-modal (Sentinel-1/2) EO time series that learns robust, label-efficient embeddings. During model training, TESSERA uses Barlow Twins and sparse random temporal sampling to enforce invariance to the selection of valid observations. We employ two key regularizers: global shuffling to decorrelate spatial neighborhoods and mix-based regulation to improve invariance under extreme sparsity. We find that for diverse classification, segmentation, and regression tasks, TESSERA embeddings deliver state-of-the-art accuracy with high label efficiency, often requiring only a small task head and minimal computation. To democratize access, adhere to FAIR principles, and simplify use, we release global, annual, 10m, pixel-wise int8 embeddings together with open weights/code and lightweight adaptation heads, thus providing practical tooling for large-scale retrieval and inference at planetary scale. The model training/inference code, downstream task code, and pre-generated embeddings can be accessed at https://github.com/ucam-eo.
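The Barlow Twins objective mentioned above pushes the cross-correlation matrix between two views' embeddings toward the identity. A minimal standard-library sketch of that loss follows; it is not the TESSERA training code, and the weight `lam` and the list-of-lists batch layout are assumptions chosen for illustration.

```python
import math
from typing import List

Batch = List[List[float]]  # N samples, each a D-dimensional embedding


def standardize(batch: Batch) -> List[List[float]]:
    """Return D columns, each with zero mean and unit variance over the batch."""
    n, d = len(batch), len(batch[0])
    cols = []
    for j in range(d):
        col = [row[j] for row in batch]
        mean = sum(col) / n
        # Guard against a constant column (zero standard deviation).
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / n) or 1.0
        cols.append([(v - mean) / std for v in col])
    return cols


def barlow_twins_loss(za: Batch, zb: Batch, lam: float = 5e-3) -> float:
    """Sum of (1 - C_ii)^2 plus lam * sum of C_ij^2 for i != j.

    C is the cross-correlation matrix between the two views' embeddings.
    """
    a, b = standardize(za), standardize(zb)
    n, d = len(za), len(a)
    loss = 0.0
    for i in range(d):
        for j in range(d):
            c_ij = sum(x * y for x, y in zip(a[i], b[j])) / n
            if i == j:
                loss += (1.0 - c_ij) ** 2   # invariance term
            else:
                loss += lam * c_ij ** 2     # redundancy-reduction term
    return loss
```

In the setting described here, the two views would be embeddings of the same pixel computed from two different sparse random subsets of its valid observations, so minimizing the loss encourages invariance to which observations were sampled.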





GENIE: Higher-Order Denoising Diffusion Solvers

Dockhorn, Tim

Neural Information Processing Systems

A crucial drawback of denoising diffusion models (DDMs) is that the generative ODE or SDE is typically difficult to solve, due to the complex score function. Therefore, efficient and tailored samplers are required for fast synthesis.
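Why a higher-order solver can help with few steps is easiest to see on a toy problem. The sketch below compares an Euler step with a second-order truncated Taylor step on the ODE dx/dt = -x. It is not the GENIE solver: the toy vector field, its hand-written derivative, and the function names are assumptions for illustration only.

```python
import math


def f(x: float) -> float:
    """Toy vector field dx/dt = -x (stands in for a learned model)."""
    return -x


def df_dt(x: float) -> float:
    """Total time derivative of f along the trajectory: (df/dx) * f = (-1) * (-x)."""
    return x


def integrate(x0: float, t_end: float, steps: int, order: int) -> float:
    """Integrate from t=0 to t_end with a truncated Taylor method of order 1 or 2."""
    h = t_end / steps
    x = x0
    for _ in range(steps):
        step = h * f(x)                      # first-order (Euler) term
        if order == 2:
            step += 0.5 * h * h * df_dt(x)   # second-order correction
        x += step
    return x


def errors(steps: int = 5):
    """Absolute errors of both methods against the exact solution exp(-1)."""
    exact = math.exp(-1.0)
    euler = abs(integrate(1.0, 1.0, steps, order=1) - exact)
    taylor2 = abs(integrate(1.0, 1.0, steps, order=2) - exact)
    return euler, taylor2
```

With the same small step budget, the second-order update lands closer to the exact solution, which is the general motivation for higher-order samplers when each step requires an expensive model evaluation.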


Supplementary Material for Curious Exploration via Structured World Models Yields Zero-Shot Object Manipulation

Neural Information Processing Systems

A GNN Architectural Details

We use message-passing GNNs in CEE-US as described in Sec. In a generalized setup, the sum in Eq. 1 can be replaced with another permutation-invariant function.

In this section, we provide experimental details and hyperparameter settings. Note that we overload the superscript to both indicate ensemble members' predictions and object-centric

Note that model learning only occurs during the intrinsic phase.

C.3.1 Details on Downstream Tasks and Reward Functions

We use the notation introduced in Sec. The actuated agent, i.e. robot, state is given by

Each goal site is at least 0.16 and at most 0.20 away from the manipulability range of the robot arm.
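The earlier remark that the sum in Eq. 1 can be replaced with another permutation-invariant function can be illustrated with a small message-passing sketch. This is not the CEE-US architecture: the fully connected graph, the hand-written `relative_message` function (a stand-in for a learned edge network), and all names are assumptions for illustration.

```python
from typing import Callable, List

Vector = List[float]


def relative_message(h_i: Vector, h_j: Vector) -> Vector:
    """Hypothetical message from node j to node i: the state difference."""
    return [b - a for a, b in zip(h_i, h_j)]


def aggregate(messages: List[Vector], how: str = "sum") -> Vector:
    """Combine messages with a permutation-invariant reduction."""
    columns = list(zip(*messages))
    if how == "sum":
        return [sum(col) for col in columns]
    if how == "mean":
        return [sum(col) / len(col) for col in columns]
    if how == "max":
        return [max(col) for col in columns]
    raise ValueError(f"unknown aggregation: {how}")


def message_passing_step(
    node_states: List[Vector],
    message_fn: Callable[[Vector, Vector], Vector] = relative_message,
    how: str = "sum",
) -> List[Vector]:
    """One round on a fully connected graph (needs at least two nodes).

    Each node aggregates the messages from all other nodes.
    """
    aggregated = []
    for i, h_i in enumerate(node_states):
        msgs = [message_fn(h_i, h_j)
                for j, h_j in enumerate(node_states) if j != i]
        aggregated.append(aggregate(msgs, how))
    return aggregated
```

Because each reduction ignores the order of its inputs, reordering the other nodes leaves a node's aggregated message unchanged, whichever of the three options is chosen.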